Educative: Interactive Courses for Software Developers

We still have to account for multiple values in the attribute, but we know that the new solution must store only a single value in each column. It may seem natural to create multiple columns in this table, each containing a single tag.

Enter to Rename, Shift+Enter to Preview

Creating Bugs table

As we assign tags to a given bug, we’d put values in one of these three columns. Unused columns would remain null.

Enter to Rename, Shift+Enter to Preview

Updating Bugs table

Let’s try to retrieve the Bugs table data after updating the value of tag2. Press “RUN” in the following playground and see the output.

Enter to Rename, Shift+Enter to Preview

Retrieving data after updating Bugs table

Tasks we could do easily with a normal attribute are now more complex.

Searching for values#

When searching for bugs with a given tag, we must search all three columns because the tag string could occupy any of these columns.

If we’re searching for a given value across multiple columns, it is a clue that the multiple columns should be stored as a single logical attribute.

For example, to retrieve bugs that reference performance, we can use a query like this:

Enter to Rename, Shift+Enter to Preview

Retrieving bugs having performance tag

We may need to search for bugs that reference both tags, i.e., performance and printing. To do this, we use a query like the following one. It is important to use the parentheses correctly because OR has lower precedence than AND.

Enter to Rename, Shift+Enter to Preview

Retrieving bugs having performance and printing tag values

The syntax required to search for a single value over multiple columns is lengthy and tedious to write. We can make it more compact by using an IN predicate in a slightly untraditional manner:

Enter to Rename, Shift+Enter to Preview

Retrieving bugs with tags using the IN keyword for performance and printing

Adding and removing values#

Adding or removing a value from a set of columns presents its own issues. Simply using UPDATE to change one of the columns isn’t safe since we can’t be sure which column is unoccupied, if any. We may have to retrieve the row into our application to see.

Enter to Rename, Shift+Enter to Preview

Retrieving data before updating something

In this case, for instance, the result shows that tag2 is null. Now, we can form the UPDATE statement.

Enter to Rename, Shift+Enter to Preview

Updating a tag for bug_id 3456

Let’s retrieve the results after writing the query in the following playground.

Enter to Rename, Shift+Enter to Preview

Retrieving data after updating the tag

We face the risk of another client going through the same steps of reading the row and updating it before we have updated the table after quering it. Depending on who applied their update first, either we or the other client risk getting an update conflict error or having their changes overwritten by the other. We can avoid this two-step query by using complex SQL expressions.

The following statement uses the NULLIF() function to make each column null if it equals a specific value. NULLIF() returns null if its two arguments are equal.

Enter to Rename, Shift+Enter to Preview

Updating Bugs table to set NULL in place of the specific value using NULLIF function

Let’s retrieve the data after updating the table.

Enter to Rename, Shift+Enter to Preview

Retrieving the records after updating the Bugs table

The following statement adds the new tag performance to the first column that is currently null. However, if none of the three columns is null, the statement makes no change to the row, and the new tag value is not recorded. Constructing this statement is laborious. Notice also that we must repeat the string performance six times.

Enter to Rename, Shift+Enter to Preview

Updating Bugs table by adding a value in the first attribute having no value

Let’s retrieve the data after using UPDATE in a query.

UPDATE Bugs
SET tag1 = CASE
WHEN 'performance' IN (tag2, tag3) THEN tag1
ELSE COALESCE(tag1, 'performance') END,
tag2 = CASE
WHEN 'performance' IN (tag1, tag3) THEN tag2
ELSE COALESCE(tag2, 'performance') END,
tag3 = CASE
WHEN 'performance' IN (tag1, tag2) THEN tag3
ELSE COALESCE(tag3, 'performance') END
WHERE bug_id = 3456;
SELECT * FROM Bugs WHERE bug_id = 3456;

Enter to Rename, Shift+Enter to Preview

Retrieving Bugs table after updating the records

Ensuring uniqueness#

We probably don’t want the same value to appear in multiple columns, but the database can’t prevent this when we use the Multi-column Attributes antipattern. In other words, it’s hard to prevent the following statement:

Enter to Rename, Shift+Enter to Preview

Inserting a new record in the Bugs table

Let’s retrieve the data in the following playground to see the results after using the INSERT statement.

Enter to Rename, Shift+Enter to Preview

Retrieving the Bugs table data after inserting a new record

Handling growing sets of values#

Another weakness of this design is that three columns might not be enough. To keep the design of one value per column, we must define as many columns as the maximum number of tags a bug can have. How can we predict, at the time we define the table, what that greatest number will be?

One tactic is to guess at a moderate number of columns and expand later, if necessary, by adding more columns. Most databases allow us to restructure existing tables to add Bugs.tag4 or even more columns as we need them.

Enter to Rename, Shift+Enter to Preview

Adding a new column in the Bugs table

However, this change is costly in three ways:

Restructuring a database table that already contains data may require locking the entire table and blocking access for other concurrent clients.
Some databases implement this kind of table restructure by defining a new table to match the desired structure, copying the data from the old table, and then dropping the old table. If the table in question has a lot of data, this transfer can take a long time.
When we add a column in the set for a multi-column attribute, we must revisit every SQL statement in every application that uses this table, editing the statement to support new columns.

Enter to Rename, Shift+Enter to Preview

Retrieving the records which contain value performance in any of its attribute

This is a meticulous and time-consuming development task. If we miss any queries that need edits, it can lead to bugs that are difficult to detect.

Introduction

Logical Antipattern - Jaywalking

Logical Antipattern - Naive Trees

Logical Antipattern - ID Required

Logical Antipattern - Keyless Entry

Assessment: Logical Antipatterns - Part 1

Logical Antipattern - Entity-Attribute-Value

Logical Antipattern - Polymorphic Associations

Logical Antipattern - Multicolumn Attributes

Logical Antipattern - Metadata Tribbles

Assessment: Logical Antipatterns - Part 2

Physical Antipattern - Rounding Errors

Physical Antipattern - 31 Flavors

Physical Antipattern - Phantom Files

Physical Antipattern - Index Shotgun

Assessment: Physical Antipatterns

Query Antipattern - Fear of the Unknown

Query Antipattern - Ambiguous Groups

Query Antipattern - Random Selection

Query Antipattern - Poor Man’s Search Engine

Query Antipattern - Spaghetti Query

Query Antipattern - Implicit Columns

Assessment: Query Antipattern

Application Development Antipattern - Readable Passwords

Application Development Antipattern - SQL Injection

Application Development Antipattern - Pseudokey Neat-Freak

Application Development Antipattern - See No Evil

Application Development Antipattern - Diplomatic Immunity

Application Development Antipattern - Magic Beans

Assessment: Application Development Antipattern

Conclusion

Antipattern: Create Multiple Columns

Searching for values#

Adding and removing values#

Ensuring uniqueness#

Handling growing sets of values#